HKU1 is a human betacoronavirus that causes mild yet prevalent
respiratory disease 1, and is related to the zoonotic SARS 2 and
MERS 3 betacoronaviruses, which have high fatality rates and
pandemic potential. Cell tropism and host range are determined
in part by the coronavirus spike (S) protein 4, which binds cellular
receptors and mediates membrane fusion. As the largest known
class I fusion protein, its size and extensive glycosylation have
hindered structural studies of the full ectodomain, thus preventing
a molecular understanding of its function and limiting development
of effective interventions. Here we present the 4.0 Å resolution
structure of the trimeric HKU1 S protein determined using
single-particle cryo-electron microscopy. In the pre-fusion conformation,
the receptor-binding subunits, S1, rest above the fusion-mediating
subunits, S2, preventing their conformational rearrangement.
Surprisingly, the S1 C-terminal domains are interdigitated and form
extensive quaternary interactions that occlude surfaces known in
other coronaviruses to bind protein receptors. These features, along
with the location of the two protease sites known to be important for
coronavirus entry, provide a structural basis to support a model of
membrane fusion mediated by progressive S protein destabilization
through receptor binding and proteolytic cleavage. These studies
should also serve as a foundation for the structure-based design of
betacoronavirus vaccine immunogens.

Betacoronavirus S proteins are processed into S1 and S2 subunits
by host proteases 5. Like other class I viral fusion proteins, the two
subunits trimerize and fold into a metastable pre-fusion conformation.
The S1 subunit is responsible for receptor binding, while the S2
subunit mediates membrane fusion. Coronaviruses typically possess
two domains within S1 capable of binding to host receptors: an amino
(N)-terminal domain (NTD) and a carboxy (C)-terminal domain
(CTD), with the latter recognizing protein receptors for SARS-CoV
and MERS-CoV 6,7. Although these individual domains have been
structurally characterized, the organization of the complete spike has
not yet been determined, preventing a mechanistic understanding of
S protein function.

Here, we present the structure of the HKU1 S protein ectodomain
determined using cryo-electron microscopy (cryo-EM) to 4.0 Å
resolution (Fig. 1a and Extended Data Figs 1 and 2 and Extended Data
Table 1). The protein construct contains a C-terminal T4 fibritin
trimerization motif and a mutated S1/S2 furin-cleavage site (Extended
Data Fig. 3). The S1 subunit adopts an extended conformation with
short linkers between domains and sub-domains (Fig. 1b). The S1 NTD
(amino acids 14-297) has strong structural and sequence homology to
the bovine coronavirus (BCoV) S1 NTD (Extended Data Fig. 4), which
recognizes acetylated sialic acids on glycosylated cell-surface receptors 8.
The glycan-binding site in the BCoV S1 NTD is conserved in the HKU1
S1 NTD and is located at the apex of the trimer, oriented towards target
cells. Indeed, HKU1 S1 was recently shown to bind O-acetylated sialic
acids on host cells, and these glycans were required for efficient
infection of primary human airway epithelial cultures 9.

The HKU1 S1 CTD (amino acids 325-605) consists of a structurally
conserved core connected to a large, variable loop (HKU1 S amino
acids 428-587) 10 that is partially disordered (Extended Data Figs 5
and 6). The CTD is located at the trimer apex close to the threefold
axis, and the core interacts with the other two S1 CTD cores and with
one NTD from an adjacent protomer. The domain swapping between
protomers results in a woven appearance when viewed looking down
towards the viral membrane (Fig. 2a). Structural alignment of the
SARS-CoV and MERS-CoV CTD-receptor complexes 11,12 with the
HKU1 pre-fusion S protein reveals that the protein-receptor-binding
surface of the S1 CTD is buried in the HKU1 S protein trimer and is
therefore incapable of making equivalent interactions without some
initial breathing and transient exposure of these domains (Fig. 2b).

Figure 1 | Structure of the HKU1 pre-fusion spike ectodomain.
a, A single protomer of the trimeric S protein is shown in cartoon
representation coloured as a rainbow from the N to C terminus (blue to
red) with the reconstructed EM density of remaining protomers shown
in white and grey. b, The S1 subunit is composed of the NTD and CTD
as well as two sub-domains (SD-1 and SD-2). The S2 subunit contains
the coronavirus fusion machinery and is primarily α-helical. c, Domain
architecture of the HKU1 S protein coloured as in a.

Figure 2 | Architecture of the HKU1 S1 subunit. a, EM density
corresponding to each S1 protomer is shown. The putative glycan-binding
and protein-receptor-binding sites are indicated with dashed shapes on
the NTD and CTD, respectively. b, The HKU1 S1 CTD forms quaternary
interactions with an adjacent CTD using a surface similar to that used
by SARS CTD to bind its receptor, ACE2 (ref. 11). c, Sub-domain 1 is
composed of amino acid residues before and after the S1 CTD.
d, Sub-domain 2 is composed of S1 sequence C-terminal to the CTD, a short
peptide following the NTD, and the N-terminal strand of S2, which follows
the S1/S2 furin-cleavage site.


Although a protein receptor has not yet been identified for HKU1,
antibodies against the CTD, but not those against the NTD, blocked
HKU1 infection of cells 13. These data suggest that the S1 CTD is the
primary HKU1 receptor-binding site 13, whereas the NTD mediates
initial attachment via glycan binding.

HKU1 S1 also contains two sub-domains (which we term SD-1 and
SD-2) that lack significant homology to previously determined
structures (Fig. 2c, d). These sub-domains are primarily composed of S1
amino acid sequences following the CTD. However, stretches of amino
acids preceding the CTD as well as S2 residues adjacent to the S1/S2
cleavage site also contribute to the sub-domains. This complex folding
of elements dispersed throughout the primary sequence may allow
receptor-induced conformational changes in the CTD to be transmitted
to other parts of the structure.

In contrast to other viral fusion proteins such as influenza
haemagglutinin (HA) 14 or HIV-1 envelope (Env) 15,16, the HKU1 S1 subunits are
rotated about the trimeric threefold axis with respect to the S2 subunits,
causing the S1 subunit from one protomer to sit above the S2 subunit
of an adjacent protomer (Extended Data Fig. 7). Similar to HA and
Env, a region in the HKU1 S1 CTD (amino acids 371-380) caps the S2
central helix, thereby preventing the fusion machinery from springing
into action.

Processing of coronavirus S proteins by host proteases plays a critical
role in the entry process 5. HKU1 S is cleaved by furin into S1 and S2
subunits during protein biosynthesis. Though mutated in the protein
construct used here and disordered in the density map, the HKU1 S
furin-cleavage site at the S1/S2 junction lies in a loop of SD-2 (Fig. 3
and Extended Data Fig. 6). Furin cleavage would leave a single S2
β-strand participating in the SD-2 β-sheets (Fig. 2d). Coronavirus S
proteins also have a secondary cleavage site, termed S2′ (Arg900) 5,
adjacent to the viral fusion peptide (amino acids 901-918) 17 (Fig. 3b
and Extended Data Fig. 6). This is similar to the multiple
endoproteolytic cleavage events that occur in the fusion proteins of respiratory
syncytial virus (RSV) and Ebola virus 18,19. Protease cleavage at S2′ likely
follows S1/S2 cleavage and may not occur until host-receptor engagement
at the plasma membrane or viral endocytosis 5.

Figure 3 | HKU1 S2 subunit fusion machinery. a, The HKU1 S2 subunit
is coloured like a rainbow from the N-terminal β-strand (blue), which
participates in S1 sub-domain 2, to the C terminus (red) before HR2.
b, The HKU1 S2 structure contains the fusion peptide (FP) and a heptad
repeat (HR1). Protease-recognition sites are indicated within disordered
regions of the protein (dashed lines). c, A comparison of coronavirus S2
HR1 in the pre- and post-fusion 22 conformations. Five HR1 α-helices are
labelled and coloured like a rainbow from blue to red, N to C terminus,
respectively. The structures are oriented to position similar portions of the
central helix (red).

As in all class I viral fusion proteins, the coronavirus S2 subunit
contains the four elements required for membrane fusion: a fusion peptide
or loop, two heptad repeats (HR1 and HR2), and a transmembrane
domain 14,20,21. Refolding of HR1 into a long α-helix thrusts the fusion
peptide into the host-cell membrane, and as the two heptad repeats
interact to form a coiled-coil, the host and viral membranes are brought
together. The fusion peptide, conserved among coronavirus S proteins 17
(Extended Data Fig. 6), is located on the exterior of the HKU1 S protein
and is adjacent to the putative S2′ cleavage site, which remains
uncleaved in our structure. The fusion peptide forms a short helix and
a loop, with most of the hydrophobic amino acids buried in an interface
with other elements of S2. Unlike influenza HA where the C terminus of
the fusion peptide is only 14 amino acids away from the N terminus of
HR1, the fusion peptide of HKU1 S is 60 amino acids away from HR1.
This span of protein contains four short α-helices and several longer
regions lacking regular secondary structure. This intervening sequence
is also buried beneath SD-2 and the S2′ cleavage site, suggesting that
cleavage may affect the proclivity of S2 for undergoing the transition
to the post-fusion conformation.

Coronavirus S protein heptad repeats are unusually large with HR1
encompassing more than 90 amino acids 20. In the cryo-EM structure,
HR2 is located at the base of the HKU1 S protein near the viral
membrane, but is poorly ordered, precluding unambiguous assignment of
the residues. However, HR1 is well ordered and arranged along the
length of the S2 subunit, forming four short helices and part of the
central three-helix bundle. This arrangement of HR1 is similar to that
of influenza HA, although in HA the HR1 is organized as two helices
connected by a long loop 14. Conversion of influenza HA to the
post-fusion conformation requires these protein elements to transition into
a single long α-helix 21. The post-fusion six-helix bundle structures of
SARS-CoV and MERS-CoV S2 heptad repeats 22,23 reveal that
coronavirus S proteins also undergo a similar transition (Fig. 3c). However,
the S protein must carry out five such loop-to-helix transitions,
highlighting the complexity of S proteins relative to other class I fusion
proteins. In addition, the membrane distal regions of the pre-fusion S2
central three-helix bundle (S2 amino acids 1070-1076), which is the
C-terminal portion of HR1, are splayed outwards from the threefold
axis (Extended Data Fig. 7). In the available coronavirus post-fusion
HR1-HR2 structures, this portion of HR1 forms a tight three-helix
bundle 22,23. Formation of this three-helix bundle may be prevented
by interactions between the C-terminal end of the S2 HR1 and the
S1 CTD, and thus disruption of these interactions through
receptor-induced conformational changes would provide an additional means
by which receptor binding in S1 can initiate S2-mediated membrane
fusion. Indeed, protease cleavage and an acidic pH are thought to be
insufficient to trigger the transition to the post-fusion conformation
without additional destabilization provided by receptor binding 24-26.

Figure 4 | Comparison of structurally related class I viral fusion proteins.
The fusion proteins from coronaviruses, influenza virus and HIV-1 are
cleaved into receptor-binding subunits (pink, light green, light blue) and
the viral fusion machinery (dark red, dark green, blue) 14-16,28. Comparison to
other class I fusion proteins can be found in Extended Data Fig. 8.

The formation of anti-parallel six-helix bundles composed of HR1 
and HR2 in the post-fusion conformation is a unifying feature of class I 
viral fusion proteins. However, the pre-fusion conformations of this 
protein family are incredibly diverse in size and topology (Extended 
Data Fig. 8). The HKU1 S protein structure presented here most closely 
resembles influenza virus HA and HIV-1 Env (Fig. 4), which also have 
receptor-binding subunits that cap the central helix of the fusion
subunit 14,15,27,28. However, some core elements of the fusion machinery are
conserved amongst all class I fusion proteins, including paramyxovirus 
F proteins. 

The HCoV-HKU1 S protein trimer in a pre-fusion conformation is,
to our knowledge, the largest class I viral fusion glycoprotein structure
determined to date (Fig. 4 and Extended Data Figs 8 and 9). Since
betacoronavirus S proteins are similar in size and have a conserved
domain organization, our findings should be generally applicable
to other betacoronaviruses, including SARS-CoV and MERS-CoV
(Extended Data Fig. 6). Our studies provide a structural basis for S
protein function wherein the pre-fusion S protein is progressively matured
and destabilized by receptor binding and protease cleavage. Following
dissociation of the S1 subunits, HR1 would transition to a long
α-helix, and the fusion peptide would be released from the side of the S2
subunit and inserted into host membranes. The structure and
mechanistic insights presented here should enable engineering of pre-fusion
stabilized coronavirus S proteins as vaccine immunogens against
current and emerging betacoronaviruses, similar to recent efforts for other
viral fusion proteins 29,30. This work also acts as a springboard for future
studies to define mechanisms of antibody recognition and
neutralization, which will lead to an improved understanding of coronavirus
immunity.

Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS

Data reporting. No statistical methods were used to predetermine sample size. 
The investigators were not blinded to allocation during experiments and outcome 
assessment. 

Protein expression and purification. A mammalian-codon-optimized gene
encoding HKU1 S (isolate N5, NCBI accession Q0ZME7) residues 1-1276 with
a C-terminal T4 fibritin trimerization domain, a HRV3C cleavage site, and a
6xHis-tag was synthesized and subcloned into the eukaryotic expression
vector pVRC8400. The S1/S2 furin-recognition site 752-RRKRR-756 was mutated
to GGSGS to generate the uncleaved construct used for cryoEM studies. Three
hours after this plasmid was transfected into FreeStyle 293-F cells (Invitrogen),
kifunensine was added to a final concentration of 5 μM. FreeStyle 293-F cells are a
high-transfection-efficiency cell line adapted for suspension culture derived from
low passage clonal cultures and after purchase were not further authenticated.
Cells were not confirmed to be free of mycoplasma, but were only used for
protein expression. Cultures were harvested after six days, and protein was purified
from the medium using Ni-NTA Superflow resin (Qiagen). The buffer was then
exchanged using a HiPrep 26/10 desalting column (GE Healthcare Biosciences)
from a high-imidazole elution buffer to a low pH buffer (20 mM Bis-Tris pH 6.5,
150 mM NaCl). Afterward, endoglycosidase H (EndoH) (10% w/w) and HRV3C
protease (1% w/w) were added to the protein and the reaction was incubated
overnight at 4 °C. The digested protein was further purified using a Superose 6 16/70
column (GE Healthcare Biosciences).
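
The furin-site substitution described above (752-RRKRR-756 to GGSGS) is a fixed-length replacement at known residue numbers. The following is a minimal sketch of how such an edit could be applied and sanity-checked in a sequence string. The function name `mutate_segment` and the 12-residue toy sequence are illustrative assumptions, not part of the original protocol; the real construct would use the full HKU1 S sequence and positions 752-756.

```python
def mutate_segment(seq: str, start: int, end: int,
                   expected: str, replacement: str) -> str:
    """Replace residues start..end (1-indexed, inclusive).

    The wild-type motif is checked first so that an off-by-one in the
    numbering raises an error instead of silently mutating the wrong site.
    """
    observed = seq[start - 1:end]
    if observed != expected:
        raise ValueError(
            f"expected {expected!r} at {start}-{end}, found {observed!r}"
        )
    return seq[:start - 1] + replacement + seq[end:]


# Toy stand-in sequence (hypothetical): the furin motif RRKRR sits at
# positions 6-10 here, not at 752-756 as in the full-length protein.
toy_wild_type = "ALPSSRRKRRGI"
toy_mutant = mutate_segment(toy_wild_type, 6, 10, "RRKRR", "GGSGS")
print(toy_mutant)
```

Because the replacement has the same length as the motif, downstream residue numbering is unchanged, which keeps positions quoted elsewhere in the text (for example Arg900) valid for the mutant construct.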

The furin-cleaved HKU1 S construct analysed by negative-stain EM was similar 
to the one described above except that it encoded residues 1-1249 and contained 
the wild-type RRKRR furin-recognition site. Expression and purification were also 
similar, except that a plasmid expressing furin was co-transfected into the FreeStyle 
293-F cells to ensure complete processing of the protein. 

Sample preparation for negative-stain electron microscopy. HKU1 S proteins
were placed directly onto 400 copper mesh grids and then stained with 1% uranyl
formate. Tris-buffered saline (TBS) was used as buffer if dilution was necessary.

Negative-stain electron microscopy data collection. Grids were loaded into a
Tecnai T12 Spirit operating at 120 keV and imaged using a Tietz TemCam-F416
CMOS at 52,000× magnification at ~1.5 μm under focus. Micrographs were
collected using Leginon 31 and processed within Appion 32. Particles were picked
using a difference-of-Gaussians approach 33 and aligned using reference-free 2D
classification employing iterative multivariate statistical analysis/multi-reference
alignment (MRA/MSA) using a binning factor of 2 to remove amorphous
particles 34. Particles in classes that did not represent views of HKU1 S proteins were
discarded. ISAC 35 was used to generate a template stack from which initial 3D
models were generated using the EMAN2 (ref. 36) procedure initialmodel.py. 3D
models were refined using EMAN1 (ref. 37).

Sample preparation for cryo-electron microscopy. Sample solution (3 μl) was
applied to the carbon face of a CF-2/2-4C C-Flat grid (Electron Microscopy
Sciences, Protochips) that had been plasma cleaned for five seconds using a
mixture of Ar/O2 (Gatan Solarus 950 Plasma system). The grid was then manually
blotted and immediately plunged into liquid ethane using a manual freeze plunger.

Cryo-electron microscopy data collection. Movies were collected via the
Leginon interface on a FEI Titan Krios operating at 300 keV mounted with a
Gatan K2 direct-electron detector 31. Each movie was collected in counting mode
at 22,500× nominal magnification resulting in a calibrated pixel size of 1.31 Å/pix
at the object level. A dose rate of ~10 e−/((cam pix) × s) was used; exposure time
was 200 ms per frame. The data collection resulted in a total of 1,049 movies
containing 50 frames each. Total dose per movie was 57 e−/Å². Data were collected at
1.0 to 3.5 μm under focus.
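
The acquisition parameters quoted above can be cross-checked with simple arithmetic. The sketch below is an illustration added for the reader rather than part of the original workflow: it derives the exposure time per movie, the dose rate implied by the reported total dose, the dose per frame, and the Nyquist limit from the pixel size. All variable names are assumptions; the input numbers are those given in the text.

```python
# Values taken from the data-collection description
PIXEL_SIZE_A = 1.31          # calibrated pixel size, Å per pixel
FRAMES_PER_MOVIE = 50
FRAME_EXPOSURE_S = 0.200     # 200 ms per frame
TOTAL_DOSE_E_PER_A2 = 57.0   # electrons per Å² per movie

# Total exposure time of one movie, in seconds
exposure_s = FRAMES_PER_MOVIE * FRAME_EXPOSURE_S

# Area of specimen imaged by one pixel, in Å²
pixel_area_a2 = PIXEL_SIZE_A ** 2

# Dose rate at the detector implied by the reported total dose,
# in electrons per pixel per second
implied_dose_rate = TOTAL_DOSE_E_PER_A2 * pixel_area_a2 / exposure_s

# Dose delivered to the specimen in each frame, in electrons per Å²
dose_per_frame = TOTAL_DOSE_E_PER_A2 / FRAMES_PER_MOVIE

# Nyquist limit: the finest resolvable spacing is two pixels
nyquist_a = 2 * PIXEL_SIZE_A

print(f"exposure per movie : {exposure_s:.1f} s")
print(f"implied dose rate  : {implied_dose_rate:.2f} e-/pixel/s")
print(f"dose per frame     : {dose_per_frame:.2f} e-/A^2")
print(f"Nyquist limit      : {nyquist_a:.2f} A")
```

The implied rate comes out slightly below 10 e−/pixel/s, consistent with the approximate dose rate stated in the text, and the Nyquist limit of twice the pixel size matches the 2.62 Å bound used elsewhere in the Methods.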

Cryo-electron microscopy data processing. Frames in each movie were aligned 38,
and CTF estimation was carried out using CTFFIND3 (ref. 39). Particles were
picked from a subset of the data employing a difference-of-Gaussians approach 33
and aligned using reference-free 2D classification employing iterative MRA/MSA
using a binning factor of two 34. The resulting 2,188 particles were used to generate
an initial 25 Å lowpass-filtered 3D reconstruction using EMAN2. SPIDER
refproj.spi 40 with a delta theta angle of 15 degrees was used to generate 83 projection
images of the initial 3D reconstruction. These projection images were used as
templates for picking particles from the entire cryo data set. Particles from the
entire data set were aligned and classified with the same methods used for the
subset of particles stated above. After 2D classification, unbinned selected particles
were symmetrically refined in RELION version 1.3 (refs 41,42) against the initial
3D reconstruction filtered to 60 Å resolution. This refinement was followed by
particle polishing and refinement of the resulting realigned, B-factor-weighted
and signal-integrated particles using RELION version 1.4b1. The resolution of the
final map was 4.04 Å at an FSC cutoff of 0.143. A mask was generated in RELION
using a threshold that accounted for the entire structure. From this threshold, the
mask was further dilated by 3 voxels and a Gaussian fall-off was generated over an
additional 6 voxels. The mask effect on FSC was taken into consideration. Phases
were randomized in the unfiltered half-set maps for initial FSC lower than 0.8
and a new FSC between these phase-randomized maps was generated and used to
correct for mask effects in the final FSC-based resolution estimate. The reported
resolution of 4.04 Å is the RELION CorrelationCorrected value.
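
The mask correction described above is commonly written as FSC_corrected = (FSC_masked − FSC_randomized) / (1 − FSC_randomized), applied to the shells in which phases were randomized. The sketch below illustrates that arithmetic and the 0.143 read-out on made-up numbers. It is not RELION's implementation: the function names, the six-shell toy curves and the box size of 100 pixels are all assumptions chosen for illustration.

```python
def corrected_fsc(fsc_masked, fsc_randomized, first_randomized_shell):
    """Subtract the mask-induced correlation measured with phase-randomized maps.

    Shells below first_randomized_shell were not randomized and are
    returned unchanged.
    """
    corrected = []
    for shell, (masked, randomized) in enumerate(zip(fsc_masked, fsc_randomized)):
        if shell < first_randomized_shell:
            corrected.append(masked)
        else:
            corrected.append((masked - randomized) / (1.0 - randomized))
    return corrected


def resolution_at_cutoff(fsc, box_size, pixel_size_a, cutoff=0.143):
    """Resolution (Å) of the last shell before the FSC first drops below cutoff."""
    last_good = None
    for shell in range(1, len(fsc)):  # shell 0 is the zero-frequency term
        if fsc[shell] < cutoff:
            break
        last_good = shell
    if last_good is None:
        raise ValueError("FSC is below the cutoff in every shell")
    return box_size * pixel_size_a / last_good


# Toy curves (hypothetical): phases randomized from shell 3 onwards,
# the first shell where the FSC falls below 0.8.
toy_masked = [1.00, 0.98, 0.90, 0.60, 0.30, 0.16]
toy_randomized = [1.00, 0.98, 0.90, 0.10, 0.10, 0.10]

toy_corrected = corrected_fsc(toy_masked, toy_randomized, first_randomized_shell=3)
res_masked = resolution_at_cutoff(toy_masked, box_size=100, pixel_size_a=1.31)
res_corrected = resolution_at_cutoff(toy_corrected, box_size=100, pixel_size_a=1.31)
print(toy_corrected, res_masked, res_corrected)
```

In this toy case the uncorrected curve stays above 0.143 one shell further than the corrected curve, which shows why an uncorrected masked FSC can overstate resolution.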

The map was B-factor sharpened employing FSC-weighting. The B-factor
was estimated in RELION based on the resolution range from 10 Å to 2.62 Å
(B-factor = −117 Å²). The detector MTF file was provided to RELION.
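
To give a feel for what a sharpening B-factor of −117 Å² does, the sketch below evaluates the per-resolution amplitude weight under two stated assumptions: that the B-factor is applied as exp(−B·s²/4) with s = 1/d, and that FSC-weighting uses the figure-of-merit term sqrt(2·FSC/(1 + FSC)). Both are widely used conventions, but the exact form used by the software in this study is not spelled out in the text, so this is an illustration only. The function name `sharpening_weight` is hypothetical.

```python
import math


def sharpening_weight(resolution_a: float, b_factor_a2: float, fsc: float = 1.0) -> float:
    """Amplitude multiplier at a given resolution.

    resolution_a : spacing d in Å (spatial frequency s = 1/d)
    b_factor_a2  : B-factor in Å²; negative values boost high frequencies
    fsc          : half-map FSC in that shell, used for the weighting term
    """
    s_squared = 1.0 / resolution_a ** 2
    fsc_weight = math.sqrt(2.0 * fsc / (1.0 + fsc))
    return fsc_weight * math.exp(-b_factor_a2 * s_squared / 4.0)


B_FACTOR = -117.0  # Å², value reported in the text

weight_10a = sharpening_weight(10.0, B_FACTOR)          # low-resolution end of the fit range
weight_4a = sharpening_weight(4.04, B_FACTOR)           # at the reported resolution, FSC ignored
weight_4a_fsc = sharpening_weight(4.04, B_FACTOR, fsc=0.143)  # same shell, FSC-weighted

print(f"weight at 10 A            : {weight_10a:.2f}")
print(f"weight at 4.04 A          : {weight_4a:.2f}")
print(f"weight at 4.04 A, FSC=0.143: {weight_4a_fsc:.2f}")
```

Under these assumptions the boost grows from roughly 1.3-fold at 10 Å to about 6-fold at 4.04 Å, and the FSC term roughly halves the boost in the shell where the FSC equals 0.143, damping noise near the resolution limit.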

Model building and refinement. An initial model of the S1 NTD was generated
using the Modeller 43 homology modelling tool in UCSF Chimera 44 with the BCoV
NTD (PDB 4H14) 8 as a template. The NTD homology model was docked into the
HKU1 S protein EM density and refined with Rosetta density-guided iterative local
refinement 45 while imposing C3 symmetry. Rosetta output models were clustered
based on pairwise r.m.s.d. using a cluster radius of 2.15 Å. The lowest energy model

Extended Data Figure 1 | Data processing flowchart. a, Processing
resulting in density map of pre-fusion HKU1 spike glycoprotein at 4.04 Å
resolution. b, FSC plot illustrating correlation between two volumes
refined independently from two distinct half sets of raw data. A final
resolution of 4.04 Å is indicated in the plot. c, Angular distribution of
raw data within the data set. A slight, but within normal range,
over-representation of top views was observed (tall red bars).

Extended Data Figure 2 | Resolution of the pre-fusion HKU1 S density
map. a, Local resolution within the EM density map. Local resolution
was calculated using ResMap 51 discretizing every 0.25 Å over a range
from 2 × voxel size (2.62 Å) to 4 × voxel size (5.24 Å). Resolution
significance criterion was set to 0.05. The resolution ranges from 3.74 Å
in stable internal secondary structures to greater than 5.00 Å in flexible
peripheral loops. b, Close-ups of secondary-structure densities. To the
left is displayed the central α-helix of an S2 monomer and to the right is
a β-sheet from the NTD domain in an S1 monomer.

Extended Data Figure 3 | Cleavage at the S1/S2 junction does not
induce large conformational changes in HKU1 spike. a, HKU1 spike
1-1249 with an attached foldon domain and wild-type furin-cleavage site
was reconstructed using negative-stain electron microscopy. b, HKU1
spike 1-1276 with an attached foldon and a mutated furin-cleavage site
reconstructed using negative-stain electron microscopy. c, HKU1 spike
1-1249 without foldon and with mutated furin-cleavage site. Side and top
views are shown.


© 2016 Macmillan Publishers Limited. All rights reserved

RESEARCH LETTER

Extended Data Figure 4 | Putative glycan binding site of the HKU1
S1 NTD. a, HKU1 trimeric S and b, an isolated monomer. Putative host
glycan-binding and protein-receptor-binding sites are indicated. c, The
bovine coronavirus (BCoV) S1 NTD structure from Peng et al. 8 (teal)
is superposed onto the HKU1 S NTD (pink). Residue side-chains
involved in the putative glycan-binding site (dashed circle) are shown
as sticks, with oxygen atoms coloured red and nitrogen atoms coloured
blue. Note that N198 (BCoV) and N188 (HKU1) are predicted N-linked
glycosylation sites.

Extended Data Figure 5 | Betacoronavirus S proteins possess a
conserved structural core in their C-terminal domains. a, The
structurally divergent loop of the S1 CTD is poorly ordered distal to the
core CTD domain. The conserved S1 CTD cores 10 of b, HKU1-CoV
highlighted in the trimeric pre-fusion S, c, HKU1-CoV as an isolated
domain, d, MERS-CoV 12 and e, SARS-CoV 11 are coloured according
to secondary structure (β-sheets: pink, α-helices: blue, lacking regular
secondary structure: grey) and the insert which differs amongst
coronaviruses is coloured yellow. Atoms participating in quaternary
interactions with other HKU1 S protomer CTDs are shown in green
surface in c. f, The positions of these interacting atoms are mapped on to
the conserved core topology. The sheet and helix nomenclature is taken
from reference 10.


Extended Data Figure 6 | Sequence alignment of human
betacoronavirus S proteins. Sequence alignment of S proteins from
HKU1, SARS-CoV and MERS-CoV using Clustal Omega 52. Protein
features described in the text are indicated: N-terminal domain (NTD),
C-terminal domain (CTD), which contains the large variable loop, the
S1/S2 and S2′ cleavage sites, fusion peptide (FP), heptad repeats 1 and 2
(HR1, HR2) and transmembrane helix (TM).

Extended Data Figure 7 | S1 sits atop an adjacent protomer's S2. a, The
HKU1 S1 subunits are rotated about the trimeric threefold axis relative to
their corresponding S2 subunits such that the S1 CTD from one protomer
caps the S2 central helix from an adjacent protomer (CTD₁, blue, caps
S2₂, red). The third protomer of the trimer has been omitted for clarity.
b, HKU1 S1 CTD (blue) uses a short helix to cap the central helix and
HR1 (red). c, The influenza haemagglutinin HA2 central helix (red) is
also capped by a helix in HA1 (blue) 14,28. d, The S2 N-terminal β-strand
is connected to the remainder of the S2 subunit via a loop and an α-helix
(dotted lines). These regions of the EM density are of insufficient quality
to confidently build this protein region but enable interpretation of
connectivity. e, In the pre-fusion HKU1 S protein, the tops of the central
S2 helices (blue, red, green) are splayed outwards from the threefold axis
and capped by the S1 CTDs (white). The S1 NTD, SD-1 and SD-2 have
been omitted for clarity. f, In the post-fusion six-helix-bundle structure of
SARS S 22, the corresponding helical regions from (e) form a well-packed
three-helix bundle.

Extended Data Figure 8 | Class I viral fusion proteins. All class I fusion
proteins require proteolytic cleavage adjacent to the fusion peptide or loop,
and the metastable pre-fusion state is triggered by a series of events that
involve pH change or receptor binding. The post-fusion conformations
all contain anti-parallel six-helix bundles composed of the HR1 and
HR2 from the membrane-proximal subunit. However, there is a great
diversity in pre-fusion conformations as shown here. Members of this
class that also participate in receptor binding 14-16,28,53 (top row), including
S glycoproteins of coronaviruses, are organized such that their receptor
binding subunits sit atop the fusion machinery, and need to be shed in
order for membrane fusion to proceed. Paramyxovirus F proteins 54-57
(bottom row; panel labels include HeV F and RSV F) have a different
architecture than the capped fusion proteins on the top row. The F
proteins all have disulfide bonds between the membrane proximal and
membrane distal subunits, and the two subunits remain interconnected
throughout the rearrangement process.

Extended Data Figure 9 | HKU1 S glycosylation. a, Sites of N-linked
glycosylation on the HKU1 S trimer and b, a single monomer. Of the
30 potential N-linked glycosylation sites in a single protomer, the
asparagine residues are observed for 21 sites and of these a small portion
of density in the EM map is observed for 10 sites corresponding to the
EndoH-trimmed sugars. Asparagines where glycan density is observed are
shown as magenta spheres. Asparagines lacking glycan density are shown
in green.
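
Potential N-linked glycosylation sites such as those counted above are conventionally predicted from sequence by searching for the N-X-S/T sequon, where X is any residue except proline. The sketch below shows one way to perform that scan. The text does not state how the 30 sites were enumerated, so this is an assumed, generic approach; the function name `find_sequons` and the 14-residue toy sequence are hypothetical.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNTS") are all reported.
# Pattern: Asn, any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> list[int]:
    """Return 1-indexed positions of asparagines in N-X-S/T sequons (X != P)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Toy sequence (hypothetical): N2-C-T and N12-G-T are sequons,
# N5-S-F lacks S/T at the third position, N8-P-S is excluded by proline.
toy_sequence = "MNCTNSFNPSANGT"
toy_sites = find_sequons(toy_sequence)
print(toy_sites)

# Fractions quoted in the caption: asparagines modelled, and glycan density seen
modelled_fraction = 21 / 30
glycan_density_fraction = 10 / 21
print(f"{modelled_fraction:.0%} of sequon asparagines modelled, "
      f"{glycan_density_fraction:.0%} of those with glycan density")
```

A sequon is necessary but not sufficient for glycosylation, which is why such a scan yields potential sites only and the caption separately reports where density was actually observed.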

Extended Data Table 1 | CryoEM data collection, processing and refinement metrics

Data collection/processing
  Microscope                      Titan Krios
  Voltage (keV)                   300
  Defocus range (μm)              1.0 to 3.5
  Movies                          1,049
  Frames per movie                50
  Exposure time per frame (ms)    200
  Magnification                   22,500×
  Dose rate (e−/pixel/s)          10
  Total dose per movie (e−/Å²)    57
  Particles                       31,435
  Map resolution (Å)              4.04

Model refinement
  Chimera CC 44                   0.87
  EMRinger score 48               2.7
  MolProbity 49                   1.6
  Clashscore 49                   3.0
  Ramachandran (%) 49
    Favored                       92.1
    Allowed                       7.0
    Outliers                      0.9

CC = cross correlation